(57) Abstract

A method and apparatus for dynamically shifting between switching and routing packets efficiently to provide high packet throughput.
The present invention designates multiple ports of an IP switched router for communication as a single interface with a second adjacent IP
switched router to provide increased bandwidth between the network interfaces on the IP switched routers. The multiple designated ports
are monitored by both IP switched routers for communications from the other. Data flows are queued up and inverse multiplexed over the
multiple ports to optimize the available bandwidth. The inverse multiplexing is done at layer 2 of the OSI reference model, so that layer 3
does not have to know about any reallocation among the multiple ports.
MULTIPORT INTERFACES FOR A NETWORK USING INVERSE
MULTIPLEXED IP SWITCHED FLOWS

CROSS-REFERENCE TO RELATED APPLICATIONS 
This application claims priority from commonly-owned U.S. provisional
patent application no. 60/030,348 filed on November 6, 1996, the disclosure of which is
herein incorporated by reference for all purposes. This application is also a continuation-
in-part application of commonly-assigned U.S. patent application no. 08/792,183 filed on
January 30, 1997, which is a continuation-in-part of U.S. patent application no. 08/597,520
filed on January 31, 1996, the disclosures of which are herein incorporated by reference for
all purposes.

COPYRIGHT NOTICE 
A portion of the disclosure of this patent document contains material which 
is subject to copyright protection. The copyright owner has no objection to the facsimile 
reproduction by anyone of the patent document or the patent disclosure, as it appears in the 
Patent and Trademark Office patent file or records, but otherwise reserves all copyright
rights whatsoever. 

BACKGROUND OF THE INVENTION 
The present invention relates to the field of network communications. In 
particular, a specific embodiment of the invention relates to improving the bandwidth of
communications between the network interfaces on two adjacent "IP switched routers." 

Local area network (LAN) switches have been conventionally used as a
quick, relatively inexpensive way to relieve congestion on shared-media LAN segments to
more effectively manage traffic and allocate bandwidth within a LAN than shared-media
hubs or simple bridges. LAN switches operate as datalink layer (layer 2 of the OSI
reference model) packet-forwarding hardware engines, dealing with media access control
(MAC) addresses and performing simple table look-up functions. Although switch-based
networks are able to offer greater throughput, they continue to suffer from problems such
as broadcast flooding and poor security. Routers, which operate at the network-layer
(layer 3 of the OSI reference model), are still required to solve these types of problems.
However, fast switching technology is overwhelming the capabilities of current routers,
creating router bottlenecks. The traditional IP packet-forwarding device on which the
Internet is based, the IP router, is showing signs of inadequacy. In addition, routers are
often expensive, complex, and of limited throughput, as compared to emerging switching
technology. To support the increased traffic demand of large enterprise-wide networks
and the Internet, IP routers need to operate faster and cost less. In current routers,
multiple paths are available between two routers and a router may switch data between one
or more paths by making a decision at layer 3 of the OSI reference model. These
traditional routers achieve throughput in the hundreds of thousands of packets-per-second
range. However, as the need for even greater throughput increases and advanced
functionalities required by more types of traffic are enabled in IP, traditional IP routers
will not suffice as packet-forwarding devices, especially since these routers are often
limited by their processor-intensive designs.

From the above, it is seen that another approach for avoiding bottlenecks 
and increasing packet throughput between nodes is needed. 



SUMMARY OF THE INVENTION 
The present invention designates multiple ports of an IP switched router for 
communication as a single interface with a second adjacent IP switched router to provide 
increased bandwidth between the network interfaces on the IP switched routers. The
multiple designated ports are monitored by both IP switched routers for communications 
from the other. Data flows are queued up and inverse multiplexed over the multiple ports 
to optimize the available bandwidth. The inverse multiplexing is done at layer 2 of the 
OSI reference model, so that layer 3 does not have to know about any reallocation among 
the multiple ports. 

According to an embodiment, the present invention provides a method for
transmitting packets over a multiport interface between an upstream node and a
downstream node in a network, where the downstream node is downstream from the
upstream node. The method includes the steps of establishing a multiport interface that
includes multiple sub-ports between the upstream node and the downstream node,
receiving a packet at the downstream node, and performing a flow classification at the
downstream node on the packet to determine whether the packet belongs to a specified
flow that should be redirected in the upstream node to the multiport interface. The method
also includes the steps of selecting a free label for one of the multiple sub-ports at the
downstream node, and informing the upstream node that future packets belonging to the
specified flow should be sent with the selected free label attached.

According to another embodiment, the present invention provides a 
computer program product that enables dynamic shifting between routing and switching in 
a network having an upstream node and a downstream node. The downstream node is 
downstream from the upstream node. The computer program product includes computer- 
readable code that establishes a multiport interface which includes multiple sub-ports 
between the upstream node and the downstream node, and computer-readable code that 
performs a flow classification on a packet at the downstream node to determine whether 
the packet belongs to a specified flow that should be redirected in the upstream node to the 
multiport interface. The computer program product also includes computer-readable code 
that selects a free label for one of the multiple sub-ports at the downstream node, 
computer-readable code that informs the upstream node that future packets belonging to the 
specified flow should be sent with the selected first free label attached, and a tangible 
medium that stores the computer-readable codes. 

These and other embodiments of the present invention, as well as its 
advantages and features, are described in more detail in conjunction with the text below 
and attached figures.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a diagram illustrating multiple port connections between two "IP 
switched routers," in accordance with a specific embodiment of the present invention; 

Fig. 2 is a diagram illustrating the queuing of multiple flows for the multiple 
ports, in accordance with the specific embodiment of the present invention; 

Fig. 3 illustrates one of the many network configurations possible in 
accordance with the present invention; 

Fig. 4a is a system block diagram of a typical computer system 151 that 
may be used as switch controller 12a in basic switching unit 12 (as shown in Fig. 1) to 
execute a specific embodiment of the system software of the present invention; 

Fig. 4b is a general block diagram of an architecture of an ATM switch 3
(the example shows a 16-port switch) that may be used as the switching hardware engine of
a basic switching unit according to an embodiment of the present invention;

Fig. 5a is a simplified diagram generally illustrating the initialization
procedure in each system node according to an embodiment of the present invention;

Fig. 5b is a simplified diagram that generally illustrates the operation of a
system node according to an embodiment of the present invention;

Fig. 6a is a diagram generally illustrating the steps involved in labelling a
flow in a system node according to an embodiment of the present invention;

Fig. 6b is a diagram generally illustrating the steps involved in switching a
flow in a basic switching unit according to an embodiment of the present invention;

Fig. 6c is a diagram generally illustrating the steps involved in forwarding a
packet in a system node according to an embodiment of the present invention;

Fig. 7a is a simplified diagram generally illustrating the multiport interface
[...] switching [...] of the present invention;

Fig. 7b is a diagram generally illustrating some of the steps involved in
determining whether a flow should be switched in a basic switching unit according to an
embodiment of the present invention;

Fig. 7c is a diagram generally illustrating steps in labelling a
flow in the upstream link for a designated [...], such as shown by the label flow
step [...] of Fig. 7b, according to an embodiment of the present invention;

Fig. 7d is a simplified diagram that generally illustrates some of the steps of
the operation of the basic switching unit according to the specific embodiment of the
present invention.
DESCRIPTION OF THE SPECIFIC EMBODIMENTS

CONTENTS 

I. General 

A. Inverse Multiplexed Transmission of Flow Labelled Packets 

B. Flow Classification, IFMP, GSMP in the Specific Embodiment 

II. System Hardware 

A. Controller Hardware 

B. Switching Hardware 

III. System Software Functionality

A. Configuration of Multiport Interfaces

B. Flow Distribution on Multiport Interfaces

IV. Conclusion

I. General 

The present invention provides for a multiport interface made of two or 
more sub-ports used as a single interface, with flow-by-flow inverse multiplexing, to 
provide at layer 2 of the OSI reference model very high speed trunking capability between
"IP switched routers," also referred to as "basic switching units." The multiport interface
increases the effective bandwidth for transmitting packets in a network. The method and
apparatus will find particular utility and is illustrated herein as it is applied in the high
throughput flow-based transmission of IP packets capable of carrying voice, video, and
data signals over a local area network (LAN), metropolitan area network (MAN), wide
area network (WAN), Internet, or the like, but the invention is not so limited. The
invention will find use in a wide variety of applications where it is desired to transmit 
packets over a network. 

A. Inverse Multiplexed Transmission of Flow Labelled Packets 

In accordance with specific embodiments of the present invention, Fig. 1
illustrates two switch controllers 12a and 14a, respectively coupled to a switching engine
12b and 14b. Each corresponding pair of switch controller and switching engine forms
what is referred to as an "IP switched router" or a "basic switching unit" (basic switching
unit 12 includes switch controller 12a and switching engine 12b, and basic switching unit
14 includes switch controller 14a and switching engine 14b), a specific embodiment of
which is described in further detail below. Although referred to as a "switching" unit, it
should be recognized that the basic switching unit of the system, via system software
installed on its switch controller, dynamically provides both layer 2 switching functionality
as well as layer 3 routing and packet forwarding functionality. In a specific embodiment,
the switching engine, which utilizes conventional and currently available asynchronous
transfer mode (ATM) switching hardware, is an ATM switch. The ATM switching
hardware providing the switching engine of the basic switching unit operates at the datalink
layer (layer 2 of the OSI reference model). Any of the software normally associated with
the ATM switch that is above the ATM Adaptation Layer type 5 (AAL-5) is completely
removed. Thus, the signalling, any existing routing protocol, and any LAN emulation
server or address resolution servers, etc. are removed. Of course, other switching
technologies such as for example fast packet switching, frame relay, 100BaseT Fast
Ethernet, Gigabit Ethernet, Fiber Distributed Data Interface (FDDI) or others may be used
to provide the switching engine of the basic switching unit, depending on the application.
The switch controller is a computer having multiple network adaptors or network interface
cards (NICs) connected to the switching engine via multiport interface 18. System
software is installed in the basic switching unit, more particularly in the computer serving as
switch controller. The switching engine serves to perform high-speed switching functions
when required by the basic switching unit, as determined by the system software. The
switching capability of the switching system is limited only by the hardware used in the
switching engine. Accordingly, the present embodiment of the invention is able to take
advantage of the high-speed, high capacity, high bandwidth capabilities of ATM
technology. In addition to performing standard connectionless IP routing functions at layer
3, the switch controller also makes flow classification decisions for packets on a local
basis, as described generally below.

In accordance with a specific embodiment of the present invention, each of
these basic switching units can also communicate with other nodes in a network or other
networks or servers via ports 16, for example. In one possible network configuration, a
trunk interface between the two basic switching units may be used. The switching engine
of each basic switching unit has multiple physical ports, each being capable of being
connected to a variety of devices, including for example data terminal equipment (DTE),
data communication equipment (DCE), servers, switches, gateways, etc. One of these
multiple ports, for example port 1, is used to provide the communication link between the
switch controller and the switching engine. Two or more of these multiple ports may be
used as a trunk interface to form a single multiport interface 18. For example, ports 8, 9,
10 and 12 of the basic switching unit 12 may be designated to be sub-ports of multiport
interface 18, as described further below. Multiport interface 18 is created by combining
several ports of the switching engine of a basic switching unit and having it appear as a
single interface to the switch controller of the basic switching unit.

Fig. 2 illustrates an example of flow-by-flow inverse multiplexing across the
multiple sub-ports of multiport interface 18. By way of illustration, flows 1-12 are shown 
being allocated to different queues for ports 8, 9, 10 and 12 of switching engine 12b. 
More specifically in this example, flows 4, 5 and 8 have been allocated to sub-port 8; 
flows 3, 9 and 11 have been allocated to sub-port 9; flows 2 and 7 have been allocated to 
sub-port 10; and flows 1, 6, 10 and 12 have been allocated to sub-port 12. The bandwidth
can be maximized by evenly spreading the flows across the multiple sub-ports to the extent 
possible. 
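
As a minimal sketch of the flow-by-flow allocation illustrated in Fig. 2, the following Python fragment assigns each new flow to the sub-port that currently carries the fewest flows. The least-loaded policy and the names `MultiportInterface` and `assign_flow` are assumptions made for illustration only; the description states only that the flows are spread across the sub-ports as evenly as possible.

```python
from collections import defaultdict


class MultiportInterface:
    """Hypothetical model of a multiport interface made of several sub-ports."""

    def __init__(self, sub_ports):
        self.sub_ports = list(sub_ports)
        self.queues = defaultdict(list)   # sub-port -> flow ids queued on it
        self.flow_to_port = {}            # flow id -> sub-port already chosen

    def assign_flow(self, flow_id):
        """Place a flow on the sub-port with the fewest flows (assumed policy)."""
        if flow_id in self.flow_to_port:
            # A known flow stays on its sub-port, so its packets keep one path.
            return self.flow_to_port[flow_id]
        port = min(self.sub_ports, key=lambda p: len(self.queues[p]))
        self.queues[port].append(flow_id)
        self.flow_to_port[flow_id] = port
        return port


# Twelve flows over sub-ports 8, 9, 10 and 12, as in the example of Fig. 2.
trunk = MultiportInterface([8, 9, 10, 12])
for flow in range(1, 13):
    trunk.assign_flow(flow)
loads = {p: len(trunk.queues[p]) for p in trunk.sub_ports}
```

The particular flow-to-port mapping produced by this sketch differs from the one drawn in Fig. 2; only the even spreading is the point of the example.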

In addition, there is an optimization between (a) the multiport interface or 
trunk 18 between switch controllers 12 and 14, and (b) the multiple connections to other 
nodes. If the bandwidth required on other ports 16 to other nodes increases, one of the 
ports allocated to multiport interface 18 could be reallocated to communicate with another 
node. Conversely, if the bandwidth of traffic to other nodes decreases, more ports could 
be allocated to the multiport trunk interface 18 between switch controllers 12 and 14. 
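
The trade-off between trunk ports and ports to other nodes can be sketched as follows. This is an illustrative assumption, not the described implementation: the utilization thresholds `high` and `low`, the function name `rebalance_ports`, and the rule of moving one port at a time are all hypothetical. The sketch keeps at least two sub-ports in the trunk, since a multiport interface is made of two or more sub-ports.

```python
def rebalance_ports(trunk_ports, other_ports, other_load, high=0.8, low=0.3):
    """Move at most one port between the trunk and the pool serving other nodes.

    other_load is an assumed utilization (0.0 to 1.0) of the ports to other nodes.
    """
    trunk_ports, other_ports = list(trunk_ports), list(other_ports)
    if other_load > high and len(trunk_ports) > 2:
        # Demand to other nodes rose: give up one trunk sub-port.
        other_ports.append(trunk_ports.pop())
    elif other_load < low and len(other_ports) > 1:
        # Demand to other nodes fell: return a port to the trunk.
        trunk_ports.append(other_ports.pop())
    return trunk_ports, other_ports


busy = rebalance_ports([8, 9, 10, 12], [16], other_load=0.9)
idle = rebalance_ports([8, 9, 10], [16, 12], other_load=0.1)
steady = rebalance_ports([8, 9, 10, 12], [16], other_load=0.5)
```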

With specific embodiments of the present invention, various network 
configurations may be implemented to provide end-to-end seamless IP traffic flow, with 
the network configurations featuring high bandwidth and high throughput between network 
interfaces on basic switching units 12 and 14 via the flow-by-flow inverse multiplexing 
over multiport interface 18 established between basic switching units 12 and 14. For 
example, Fig. 3 illustrates one of the many network configurations possible in accordance 
with the present invention. Of course, many alternate configurations are possible. In one 
embodiment, multiport interface 18 could be used between basic switching units 12 and 14 
as shown in Fig. 3, according to the present invention. Basic switching units, switch 
gateway units, and system software allow users to build flexible IP network topologies
targeted at the workgroup, campus, and WAN environments for a high performance,
scalable solution to current campus backbone congestion problems.

More specifically, Fig. 3 shows a simplified diagram of a high performance
workgroup environment in which several host computers 145 are connected via ATM links
133 to multiple basic switching units 12 and 14, which each connect to a switch gateway
unit 121 that connects LAN 135 with user devices 141. In this configuration, a first
basic switching unit 12 connects to a second basic switching unit 14 via multiport interface
18, such as seen in Fig. 1. Multiple host computers 145 connect to the first basic
switching unit 12 via respective 155 Mbps ATM links 133x (where x = 2 to 5) through
respective ATM NICs 147. In addition, multiple host computers 145 connect to the
second basic switching unit 14 via respective 25 Mbps ATM links 133y (where y = 8 to
10) through respective ATM NICs 149. As discussed above, host computers 145 equipped
with ATM NICs are installed with a subset of the system software, enabling the TCP/IP
hosts to connect directly to a basic switching unit. The first and second basic switching
units 12 and 14 connect to switch gateway unit 121 via respective ATM links 133 (155 Mbps
and 25 Mbps, respectively). Connection of the first and second basic switching units 12
and 14 to switch gateway unit 121 via an Ethernet (e.g., 10BaseT) or FDDI link 139
enables users of host computers 145 to communicate with user devices 141 attached to
LAN 135. User devices 141 may be PCs, terminals, or workstations having appropriate
NICs 143 to connect to any Ethernet or FDDI LAN 135. The workgroup of host
computers is thereby seamlessly integrated with the rest of the campus network.

It is noted that a "switch gateway unit," which is similar to a basic switching
unit without a switching engine, includes a gateway switch controller and IFMP software
installed on the gateway switch controller, in accordance with a specific embodiment. The
gateway switch controller includes multiple network adaptors or NICs, and an ATM NIC.
Switch gateway unit serves as an access device to enable connection of existing LAN and
backbone environments to a network of basic switching units. Accordingly, the NICs of
the switch gateway unit may be of different types, such as for example Ethernet NICs, Fast
Ethernet NICs, FDDI NICs, and others, or any combination of the preceding. Of course,
the use of particular types of NICs depends on the types of existing LAN and backbone
environments to which switch gateway unit provides access. It is recognized that multiple
LANs may be connected to a switch gateway unit. The ATM NIC allows the switch
gateway unit to connect via an ATM link to a basic switching unit. Of course, others of
the multiple NICs may also be ATM NICs to provide a connection from the switch
gateway unit to another switch gateway. In addition to basic switching units and switch
gateway units, networks utilizing the present invention also may include high
performance host computers, workstations, or servers that are appropriately equipped. In
particular, a subset of the IFMP software can be installed on a host computer, workstation,
or server equipped with an appropriate ATM NIC to enable a host to connect directly to a
basic switching unit.

According to specific embodiments of the present invention, system
software on a switch controller of a basic switching unit can create and delete multiport
interfaces and then direct the switching engine to switch flows on the multiport interface,
thereby implementing the inverse multiplexing of flows over multiport interface 18. The
system software adds complete IP routing functionality on top of ATM switching hardware,
in place of any existing conventional ATM switch control software, and controls the ATM
switch such that the flows are appropriately multiplexed over the sub-ports of interface 18.
Therefore, the present system is capable of moving between network layer IP routing when
needed and high throughput datalink layer flow switching over interface 18 when possible,
in order to provide high speed, high capacity packet transmission in an efficient manner
without the problem of router bottlenecks. Using a multiport interface 18 between adjacent
IP switched routers, the packet throughput between their attached network interfaces may
reach millions of IP packets-per-second, which is an order of magnitude faster than with
traditional IP routers.

B. Flow Classification, IFMP, GSMP in the Specific Embodiment 
Using the Ipsilon Flow Management Protocol (IFMP), which is described in
further detail in commonly-assigned U.S. patent application no. 08/597,520, a system node
(such as a basic switching unit, switch gateway unit, or host computer/server/workstation)
can classify IP packets as belonging to a "flow" of similar packets based on certain
common characteristics. A flow is a sequence of packets sent from a particular source to a
particular (unicast or multicast) destination that are related in terms of their routing and
any local handling policy they may require. The present invention efficiently permits
different types of flows to be handled differently, depending on the type of flow, and
enables the inverse multiplexing of different flows over different sub-ports of multiport
interface 18, depending on the bandwidth available on each designated sub-port. Some
types of flows may be handled by mapping them into individual ATM connections using
the ATM switching engine to perform high speed switching of the packets over multiport
interface 18. Flows such as for example those carrying real-time traffic, those with quality
of service requirements, or those likely to have a long holding time, may be configured to
be switched whenever possible. Other types of flows, such as for example short duration 
flows or database queries, may be handled by connectionless IP routing. A particular 
flow of packets may be associated with a particular ATM label (i.e., a virtual path
identifier (VPI) and virtual channel identifier (VCI)). It is assumed that virtual channels
are unidirectional so an ATM label of an incoming direction of each link is owned by the
input port to which it is connected. Each direction of transmission on a link is treated
separately. Of course, flows travelling in each direction are handled by the system 
separately but in a similar manner. 
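
The distinction between flows that are worth switching and flows that are left to connectionless IP routing can be illustrated with a small policy function. This is a sketch under stated assumptions: the description does not specify how a flow type is recognized, so the use of destination port numbers, the two port tables, and the names `PacketHeader` and `should_switch` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical policy tables; the chosen port numbers are illustrative assumptions.
SWITCH_PORTS = {20, 21, 23, 80}   # FTP data/control, Telnet, HTTP: likely long-lived
ROUTE_PORTS = {53, 123, 161}      # DNS, NTP, SNMP: short query/response exchanges


@dataclass(frozen=True)
class PacketHeader:
    src: str
    dst: str
    protocol: str                 # "tcp" or "udp"
    dst_port: int
    qos_requested: bool = False


def should_switch(hdr: PacketHeader) -> bool:
    """Return True if the flow should be labelled and switched at layer 2."""
    if hdr.qos_requested:
        return True               # flows with quality of service requirements
    if hdr.dst_port in ROUTE_PORTS:
        return False              # short-lived exchanges stay on the routed path
    return hdr.protocol == "tcp" and hdr.dst_port in SWITCH_PORTS


ftp = PacketHeader("10.0.0.1", "10.0.0.2", "tcp", 20)
dns = PacketHeader("10.0.0.1", "10.0.0.3", "udp", 53)
video = PacketHeader("10.0.0.1", "10.0.0.4", "udp", 5004, qos_requested=True)
```

Because the decision is local to each node, a different node could apply a different table without any coordination.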

Flow classification is a local policy decision. When an IP packet is received
by a system node, the system node transmits the IP packet via the default channel. The
node also classifies the IP packet as belonging to a particular flow, and accordingly decides
whether future packets belonging to the same flow should preferably be switched directly
in the ATM switching engine or continue to be forwarded hop-by-hop by the router
software in the node. If a decision to switch a flow of packets is made, the flow must first
be labelled. To label a flow, the node selects for that flow an available label (VPI/VCI) of
the input port on which the packet was received. The node which has made the decision to
label the flow then stores the label, flow identifier, and a lifetime, and then sends an IFMP
redirect message upstream to the previous node from which the packet came. The flow
identifier contains the set of header fields that characterize the flow. The lifetime specifies
the length of time for which the redirection is valid. Unless the flow state is refreshed, the
association between the flow and label is deleted upon the expiration of the lifetime.
Expiration of the lifetime before the flow state is refreshed results in further packets
belonging to the flow being transmitted on the default forwarding channel between the
adjacent nodes. A flow state is refreshed by sending upstream a redirect message having
the same label and flow identifier as the original and having another lifetime. The redirect
message requests the upstream node to transmit all further packets that have matching
characteristics to those identified in the flow identifier via the virtual channel specified by
the label. The redirection decision is also a local decision handled by the upstream node,
whereas the flow classification decision is a local decision handled by the downstream
node. Accordingly, even if a downstream node requests redirection of a particular flow of
packets, the upstream node may decide to accept or ignore the request for redirection. In
addition, redirect messages are not acknowledged. Rather, the first packet arriving on the
new virtual channel serves to indicate that the redirection request has been accepted.
Different encapsulations are used for the transmission of IP packets that belong to
particular labelled flows on an ATM data link, depending on the different flow type of the 
flows. In the present example of Fig. 2, twelve types of encapsulations are used to 
transmit IP packets belonging to the twelve types of flows to be multiplexed over multiport 
interface 18. 
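
The labelling, refresh, and expiry behaviour described above can be sketched as downstream-node state that times out unless refreshed. This is a minimal sketch, not the IFMP wire format: the class names `Redirect` and `DownstreamNode`, the field layout, and the injectable `clock` parameter are assumptions introduced for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class Redirect:
    """Hypothetical stand-in for an IFMP redirect message (not the wire format)."""
    flow_id: tuple
    label: tuple        # (VPI, VCI)
    lifetime: float     # seconds for which the redirection is valid


class DownstreamNode:
    def __init__(self, free_labels, lifetime=60.0, clock=time.monotonic):
        self.free_labels = list(free_labels)
        self.lifetime = lifetime
        self.clock = clock
        self.flows = {}   # flow_id -> (label, expiry time)

    def label_flow(self, flow_id):
        """Select a free label, store the state, and build the redirect to send upstream."""
        label = self.free_labels.pop(0)
        self.flows[flow_id] = (label, self.clock() + self.lifetime)
        return Redirect(flow_id, label, self.lifetime)

    def refresh(self, flow_id):
        """Re-send the same label and flow identifier with another lifetime."""
        label, _ = self.flows[flow_id]
        self.flows[flow_id] = (label, self.clock() + self.lifetime)
        return Redirect(flow_id, label, self.lifetime)

    def expire(self):
        """Delete flow/label associations whose lifetime ran out; free their labels."""
        now = self.clock()
        for flow_id, (label, expiry) in list(self.flows.items()):
            if expiry <= now:
                del self.flows[flow_id]
                self.free_labels.append(label)
```

Since redirect messages are not acknowledged, the sketch keeps no record of whether the upstream node accepted the request; after expiry the flow simply falls back to the default channel.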

In addition to using IFMP to classify and redirect flows, a system node such 
as a basic switching engine may utilize the General Switch Management Protocol (GSMP, 
also described in detail in U.S. patent application no. 08/597,520) to establish 
communication over the ATM link between the switch controller and ATM hardware 
switching engine of the basic switching unit and thereby enable flow-by-flow multiplexed 
layer 2 switching when possible and layer 3 IP routing and packet forwarding when 
necessary. In particular, GSMP is a general purpose, asymmetric protocol to control the 
switching engine, e.g., the ATM switch. That is, the switch controller acts as the master 
with the ATM switch as the slave. GSMP runs on a virtual channel established at 
initialization across the ATM link between the switch controller and the ATM switch. A 
single switch controller may use multiple instantiations of GSMP over separate virtual 
channels to control multiple ATM switches. Also included in GSMP is a GSMP adjacency 
protocol, which is used to synchronize state across the ATM link between the switch 
controller and the ATM switch, to discover the identity of the entity at the other end of the 
link, and to detect changes in the identity of that entity. 

GSMP allows the switch controller to establish and release connections 
across the ATM switch, add and delete leaves on a point-to-multipoint connection, manage 
switch ports, request configuration information, and request statistics (such as the level of 
traffic on each port). GSMP also allows the ATM switch to inform the switch controller 
of events such as a link going down. In accordance with a specific embodiment of the 
present invention, a switch controller may use GSMP to configure multiport interface 18 
and to direct the switching engine to switch flows on the multiport interface 18 such that 
the switching engine distributes the flows across the individual sub-ports designated by the 
configuration. Creation and deletion of multiport interfaces is done at the switch controller 
with configuration information being stored in, for example, non-volatile memory in the 
switch controller. The bandwidth and performance of the multiport interface approaches 
that of a single interface having a bandwidth equal to the sum of that of the individual sub-
ports. For example, as seen in Fig. 2, four OC3 (155 Mbps) interfaces may be combined into
what is effectively an OC12 (622 Mbps) connection. The multiport interface 18 is useful in
relieving congestion that might result in trunk connections between basic switching units,
such as when traffic from many LANs (e.g., Fast Ethernets) on one of the basic switching
units might cross a trunk before reaching an important server or server farm.

Although shown in connection with distributing multiple flows as a
preferred embodiment, the present invention can also be used to allocate bandwidth across
multiple ports where other communication methods are used. In this situation, the
ability to use multiple ports in a manner invisible to layer 3 of the OSI reference
model reduces the overhead and time required to do the transmissions and to
reconfigure as necessary.

Switching engine 12b is assumed to contain multiple ports, where each
physical port is a combination of an input port and an output port. ATM cells arrive at the
ATM switch from an external communication link on incoming virtual channels at an input
port, and depart from the ATM switch to an external communication link on outgoing
virtual channels from an output port. As mentioned earlier, virtual channels on a port or
link are referenced by their VPI/VCI. A virtual channel connection across an ATM switch
is formed by connecting an incoming virtual channel (or root) to one or more outgoing
virtual channels (or branches). Virtual channel connections are referenced by the input
port on which they arrive and the VPI/VCI of their incoming virtual channel. In the
switch, each port has a hardware look-up table indexed by the VPI/VCI of the incoming
ATM cell, and entries in the tables are controlled by a local control processor in the
switch.
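
The per-port look-up table described above can be modelled in software as a mapping from the incoming VPI/VCI (the root) to one or more outgoing branches. The following is an illustrative sketch only; in the switch the table is a hardware structure, and the names `SwitchPort`, `add_branch`, and `forward` are hypothetical.

```python
class SwitchPort:
    """Software model of one input port's VPI/VCI look-up table (illustrative)."""

    def __init__(self, number):
        self.number = number
        self.table = {}   # (vpi, vci) -> list of (out_port, out_vpi, out_vci)

    def add_branch(self, vpi, vci, out_port, out_vpi, out_vci):
        """Connect the incoming virtual channel (root) to one more outgoing branch."""
        self.table.setdefault((vpi, vci), []).append((out_port, out_vpi, out_vci))

    def forward(self, vpi, vci):
        """Return every branch an arriving cell is copied to; empty if no connection."""
        return self.table.get((vpi, vci), [])


port1 = SwitchPort(1)
port1.add_branch(0, 32, out_port=8, out_vpi=0, out_vci=40)    # point-to-point
port1.add_branch(0, 33, out_port=9, out_vpi=0, out_vci=41)    # point-to-multipoint root
port1.add_branch(0, 33, out_port=10, out_vpi=0, out_vci=42)   # second branch (leaf)
```

Directing a flow onto a given sub-port of the multiport interface then amounts to choosing the `out_port` written into the entry for that flow's incoming label.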

II. System Hardware

A. Controller Hardware 

Fig. 4a is a system block diagram of a typical computer system 151 that
may be used as switch controller 12a in basic switching unit 12 (as shown in Fig. 1) to
execute the system software of the present invention. Fig. 4a also illustrates an example of
the computer system that may be used as switch gateway controller of switch gateway unit
121 (as shown in Fig. 3), as well as serving as an example of a typical computer which
may be used as a host computer/server/workstation loaded with a subset of the IFMP
software. Of course, it is recognized that other elements such as a monitor, screen, and
keyboard are added for the host.

As shown in Fig. 4a, computer system 151 includes subsystems such as a 
central processor 169, system memory 171, I/O controller 173, fixed disk 179, network 
interface 181, and read-only memory (ROM) 183. Of course, the computer system 151
optionally includes monitor 153, keyboard 159, display adapter 175, and removable disk
177, for the host. Arrows such as 185 represent the system bus architecture of computer
system 151. However, these arrows are illustrative of any interconnection scheme serving
to link the subsystems. For example, a local bus could be utilized to connect central
processor 169 to system memory 171 and ROM 183. Configuration information for
creation of multiport interfaces may be stored, for example, on ROM 183. Other
computer systems suitable for use with the present invention may include additional or
fewer subsystems. For example, another computer system could include more than one
processor 169 (i.e., a multi-processor system) or a cache memory.

In an embodiment of the invention, the computer used as the switch
controller can be a standard Intel-based central processing unit (CPU) machine equipped
with a standard peripheral component interconnect (PCI) bus, as well as with an ATM
network adapter or network interface card (NIC). The computer is connected to the ATM
switch via a 155 Megabits per second (Mbps) ATM link using the ATM NIC. In this
embodiment, the system software is installed on fixed disk 179 which is the hard drive of
the computer. As recognized by those of ordinary skill in the art, the system software may
be stored on a CD-ROM, floppy disk, tape, or other tangible media that stores
computer-readable code.

Computer system 151 shown in Fig. 4a is but an example of a computer
system suitable for use (as the switch controller of a basic switching unit, as the switch
gateway controller of a switch gateway unit, or as a host computer/server/workstation)
with the present invention. Other configurations of subsystems suitable for use with the
present invention will be readily apparent to one of ordinary skill in the art. In addition,
the switch gateway unit may be equipped with multiple other NICs to enable connection to
various types of LANs. Other NICs or alternative adaptors for different types of LAN
backbones may be utilized in the switch gateway unit. For example, an SMC 10M/100M
Ethernet NIC or an FDDI NIC may be used.

Without in any way limiting the scope of the invention, Table 1 provides a
list of commercially available components which are useful in operation of the controller,
according to the above embodiment. It will be apparent to those of skill in the art that the
components listed in Table 1 are merely representative of those which may be used in
association with the inventions herein and are provided for the purpose of facilitating
assembly of a device in accordance with one particular embodiment of the invention. A
wide variety of components readily known to those of skill in the art could readily be
substituted or functionality could be combined or separated.

Table 1: Controller Components

Microprocessor        Intel Pentium 133 MHz processor
System memory         16 Mbyte RAM/256K cache memory
Motherboard           Intel Endeavor motherboard
ATM NIC               Zeitnet PCI ATM NIC (155 Mbps)
Fixed or hard disk    500 Mbyte IDE disk
Drives                standard floppy, CD-ROM drive
Power supply          standard power supply
Chassis               standard chassis


B. Switching Hardware 

As discussed above, the ATM switch hardware provides the switching
engine 12b of basic switching unit 12, in accordance with a specific embodiment. The
ATM switching engine utilizes vendor-independent ATM switching hardware. However,
the ATM switching engine according to the present invention does not rely on any of its
usual connection-oriented ATM routing and signaling software (SSCOP, Q.2931, UNI
3.0/3.1, and P-NNI). Rather, any ATM protocol and software are completely discarded,
and the basic switching unit relies on the system software to create and delete multiport
interfaces and to control the ATM switching engine for inverse multiplexing of flows. The
system software is described in detail later.

Separately available ATM components may be assembled into a typical
ATM switch architecture. For example, Fig. 4b is a general block diagram of an
architecture of an ATM switch 12b (the example shows a 16-port switch) that may be used
as the switching hardware engine of a basic switching unit according to an embodiment of
the present invention. However, commercially available ATM switches also may operate
as the switching engine of the basic switching unit according to other embodiments of the
present invention. The main functional components of switching hardware 12b include a
switch core, a microcontroller complex, and a transceiver subassembly. Generally, the 
switch core performs the layer 2 switching, the microcontroller complex provides the 
system control for the ATM switch, and the transceiver subassembly provides for the 
interface and basic transmission and reception of signals from the physical layer. In the 
present example, the switch core is based on the MMC Networks ATMS 2000 ATM 
Switch Chip Set which includes White chip 200, Grey chip 202, MBUF chips 204, Port 
Interface Device (PIF) chips 206, and common data memory 208. The switch core also 
may optionally include VC Activity Detector 210, and Early Packet Discard function 212. 
Packet counters also are included but not shown. White chip 200 provides configuration 
control and status. In addition to communicating with White chip 200 for status and
control, Grey chip 202 is responsible for direct addressing and data transfer with the 
switch tables. MBUF chips 204 are responsible for movement of cell traffic between PIF 
chips 206 and the common data memory 208. Common data memory 208 is used as cell 
buffering within the switch. PIF chips 206 manage transfer of data between the MBUF 
chips to and from the switch port hardware. VC Activity Detector 210 which includes a 
memory element provides information on every active virtual channel. Early Packet 
Discard 212 provides the ability to discard certain ATM cells as needed. Packet counters 
provide the switch with the ability to count all packets passing all input and output ports. 
Buses 214, 215, 216, 217, and 218 provide the interface between the various components
of the switch. The microcontroller complex includes a central processing unit (CPU) 230, 
dynamic random access memory (DRAM) 232, read only memory (ROM) 234, flash 
memory 236, DRAM controller 238, Dual Universal Asynchronous Receiver-Transmitter 
(DUART) ports 240 and 242, and external timer 244. CPU 230 acts as the
microcontroller. ROM 234 acts as the local boot ROM and includes the entire switch code 
image, basic low-level operation system functionality, and diagnostics. DRAM 232 
provides conventional random access memory functions, and DRAM controller 238 (which 
may be implemented by a field programmable gate array (FPGA) device or the like) 
provides refresh control for DRAM 232. Flash memory 236 is accessible by the 
microcontroller for hardware revision control, serial number identification, and various 
control codes for manufacturability and tracking. DUART Ports 240 and 242 are provided 
as interfaces to communications resources for diagnostic, monitoring, and other purposes. 
External timer 244 interrupts CPU 230 as required. Transceiver subassembly includes 
physical interface devices 246, located between PIF chips 206 and physical transceivers
(not shown). Interface devices 246 perform processing of the data stream, and implement
the ATM physical layer. Of course, the components of the switch may be on a printed
circuit board that may reside on a rack for mounting or for setting on a desktop, depending
on the chassis that may be used.

Without in any way limiting the scope of the invention, Table 2 provides a
list of commercially available components which are useful in operation of the switching
engine, according to the above specific embodiment. It will be apparent to those of skill in
the art that the components listed in Table 2 are merely representative of those which may
be used in association with the inventions herein and are provided for the purpose of
facilitating assembly of a device in accordance with a particular embodiment of the
invention. A wide variety of components or available switches readily known to those of
skill in the art could readily be substituted or functionality could be combined or separated.

Table 2: Switch Components

SWITCH CORE
Core chip set         MMC Networks ATMS 2000 ATM Switch Chip Set
                      (White chip, Grey chip, MBUF chips, PIF chips)
Common data memory    standard memory modules
Packet counters       standard counters

MICROCONTROLLER COMPLEX
CPU                   Intel 960CA/CF/HX
DRAM                  standard DRAM modules
ROM                   standard ROM
Flash memory          standard flash memory
DRAM controller       standard FPGA, ASIC, etc.
DUART                 16552 DUART
External timer        standard timer

TRANSCEIVER SUBASSEMBLY
Physical interface    PMC-Sierra PM5346



III. System Software Functionality

As generally described above in accordance with the specific embodiment,
IFMP is a protocol for instructing an adjacent node to attach a layer 2 "label" to a

system node that is a basic switching unit performs step 280. Other system nodes (e.g.,
switch gateway unit or host) operate as shown in Fig. 5b but do not perform step 280 since
the result of step 278 is no for a switch gateway unit or a host (as these types of system
nodes have no downstream link). 

Fig. 6a is a diagram generally illustrating the steps involved in labelling a 
flow in the upstream link of a system node, such as shown by label flow step 274 of Fig. 
5b. For a system node that is a switch gateway unit or a host, the system node labels a 
flow as shown in steps 290, 292, 300 and 276 of Fig. 6a. When the label flow step begins 
(step 290), the system node selects a free label x on the upstream link in step 292. The 
system node then sends an IFMP redirect message on the upstream link in step 300 (as 
indicated by dotted line 293). The system node then forwards the packet in step 276. For 
a system node that is a basic switching unit, labelling a flow is also illustrated by steps 
294, 296, and 298. When the label flow step begins (step 290), the basic switching unit 
selects a free label x on the upstream link in step 292. The switch controller of basic 
switching unit then selects a temporary label x' on the control port of the switch controller 
in step 294. At step 296, the switch controller then sends to the hardware switching 
engine a GSMP message to map label x on the upstream link to label x' on the control 
port. The switch controller then waits in step 298 until a GSMP acknowledge message is 
received from the hardware switching engine that indicates that the mapping is successful. 
Upon receiving acknowledgement, the basic switching unit sends an IFMP redirect
message on the upstream link in step 300. After step 300, the system node returns to step
276 as shown in Fig. 5b.
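The label flow steps of Fig. 6a can be traced in a short sketch. This is a minimal illustration under stated assumptions, not the system software of Appendix I: `FakeSwitchEngine`, `pick_free_label`, and `label_flow` are hypothetical names, the label range 32 to 1023 is invented for the example, and the GSMP and IFMP exchanges are reduced to a method call and a list append.

```python
from typing import List, Optional, Set, Tuple


class FakeSwitchEngine:
    """Hypothetical stand-in for the hardware switching engine (the GSMP peer)."""

    def __init__(self) -> None:
        self.mappings: List[Tuple[str, int, str, int]] = []

    def map_label(self, in_link: str, in_label: int,
                  out_link: str, out_label: int) -> bool:
        # Record the requested mapping and acknowledge it immediately.
        self.mappings.append((in_link, in_label, out_link, out_label))
        return True


def pick_free_label(in_use: Set[int], first: int = 32, last: int = 1023) -> int:
    """Return the lowest unused label in [first, last] and mark it as in use."""
    for label in range(first, last + 1):
        if label not in in_use:
            in_use.add(label)
            return label
    raise RuntimeError("no free label available")


def label_flow(flow_id: str,
               upstream_in_use: Set[int],
               sent_redirects: List[Tuple[str, int]],
               engine: Optional[FakeSwitchEngine] = None,
               control_in_use: Optional[Set[int]] = None) -> int:
    """Label a flow on the upstream link; pass `engine` for a basic switching unit."""
    x = pick_free_label(upstream_in_use)                  # step 292
    if engine is not None:                                # basic switching unit only
        if control_in_use is None:
            raise ValueError("control_in_use is required with an engine")
        x_prime = pick_free_label(control_in_use)         # step 294
        acked = engine.map_label("upstream", x, "control", x_prime)  # step 296
        if not acked:                                     # step 298
            raise RuntimeError("GSMP mapping was not acknowledged")
    sent_redirects.append((flow_id, x))                   # step 300 (IFMP redirect)
    return x                                              # caller forwards the packet
```

A switch gateway unit or host calls `label_flow` without an engine and so skips steps 294 through 298, which matches the shorter path through Fig. 6a.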

Fig. 6b is a diagram generally illustrating the steps involved in switching a
flow in a basic switching unit, such as shown by switch flow step 280 of Fig. 5b. As 
mentioned above, only system nodes that are basic switching units may perform the switch 
flow step. When the switch flow procedure starts in step 310, the switch controller in the 
basic switching unit sends at step 312 a GSMP message to map label x on the upstream 
link to the label y on the downstream link. Label y is the label which the node 
downstream to the basic switching unit has assigned to the flow. Of course, this 
downstream node has labelled the flow in the manner specified by Figs. 5b and 6a, with 
the free label y being selected in step 292. After step 312, the switch controller in the 
basic switching unit waits in step 314 for a GSMP acknowledge message from a hardware 
switching engine in basic switching unit to indicate that the mapping is successful. The

downstream and switch controller 12a of the upstream basic switching unit 12 in step 396
determines that the flow label is for a multiport interface, then the upstream basic
switching unit 12 switches the flow for the designated multiport interface in step 280. After
accordingly switching the flow using the appropriate label determined downstream (for a
particular port if the flow label is not for a multiport interface, or for a particular sub-port
if the flow label is for a multiport interface) in step 280, the basic switching unit 12 at step
276 forwards the packet downstream.
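The switch flow step can be sketched as a single table update that maps upstream label x to the downstream label y on either a port or a sub-port. This is an illustrative assumption about the bookkeeping only: the `Redirect` record and the `switch_flow` function are hypothetical, and a dictionary stands in for the mapping that a GSMP message would install in the switching engine.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Redirect:
    """Hypothetical record of what the downstream node chose for a flow."""
    label: int                       # label y selected downstream
    port: int                        # port (or multiport interface number) facing downstream
    sub_port: Optional[int] = None   # set only when the label is for a multiport interface


def switch_flow(table: Dict[int, Tuple[int, int]], x: int,
                redirect: Redirect) -> Tuple[int, int]:
    """Map upstream label x to (output port, y), as a GSMP map request would."""
    if redirect.sub_port is not None:
        out_port = redirect.sub_port   # multiport interface: use the chosen sub-port
    else:
        out_port = redirect.port       # ordinary interface: use the port itself
    table[x] = (out_port, redirect.label)
    return table[x]
```

Because the mapping is installed at layer 2, the routing layer above it does not need to know which sub-port carries a given flow.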

When a packet is forwarded or switched from an upstream node and
received at the downstream node, the downstream node proceeds to forward the packet
traffic on the chosen label from all sub-ports of the multiport interface.

As seen in the above description for a specific embodiment, inverse 
multiplexing across sub-ports of a multiport interface may be accomplished on a flow-by- 
flow basis with the net result being that traffic is distributed fairly evenly across the sub- 
ports. 
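One way to obtain such an even spread, consistent with the selection criteria recited in the claims (lowest outgoing cell rate, lowest-numbered sub-port on a tie, and avoidance of failed sub-ports), is sketched below. The `SubPort` record and `select_sub_port` function are hypothetical names chosen for illustration, and the cell rates are supplied by the caller rather than measured.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SubPort:
    number: int
    cell_rate: float       # current outgoing cell rate, in cells per second
    failed: bool = False   # True if the sub-port is experiencing failure


def select_sub_port(sub_ports: Iterable[SubPort]) -> SubPort:
    """Pick the working sub-port with the lowest cell rate; ties go to the lowest number."""
    candidates = [p for p in sub_ports if not p.failed]
    if not candidates:
        raise RuntimeError("all sub-ports have failed")
    return min(candidates, key=lambda p: (p.cell_rate, p.number))
```

Since each new flow goes to the least loaded working sub-port at the time it is labelled, load evens out over many flows without moving flows that are already switched.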

In accordance with another specific embodiment, inverse multiplexing
across sub-ports of a multiport interface may be accomplished on a flow-by-flow basis in
another manner, with the net result being that traffic is not balanced across all the
specified sub-ports; instead, the current sub-port is fully loaded with flows before a
flow is added to the next sub-port. In
particular, the present specific embodiment is achieved by using weighted multiport
interfaces, which are configured in a similar manner as multiport interfaces. It is noted
that both the upstream and downstream basic switching units are each configured for the
same weighted multiport interfaces. Configuring a weighted multiport interface may be
achieved in the specific embodiment by a command to define the weighted multiport
interface. Specifically, the switch controller of the upstream basic switching unit (Fig. 1)
defines a weighted multiport interface by a command (e.g., define wmpif 8 9 10
12) which designates, in this example, ports 8, 9, 10 and 12 to be sub-ports of the
weighted multiport interface. Similarly, managing the weighted multiport interface may be
achieved with other commands to show the weighted multiport interface, and to delete the
weighted multiport interface. In this example, when sub-port 8 of the weighted multiport
interface is running at full line rate, then sub-port 9 will be opened for queuing of flows.
When sub-port 9 is running at the full line rate, then sub-port 10 will be opened for
queuing of flows, and so forth.
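The fill-first behaviour of a weighted multiport interface can be sketched as a scan over the sub-ports in their configured order. This is an illustration only: `select_weighted_sub_port` is a hypothetical name, loads are supplied by the caller, and the fallback used when every sub-port is already at line rate is an assumption, since the description does not say what happens in that case.

```python
from typing import Dict, Sequence


def select_weighted_sub_port(order: Sequence[int],
                             load: Dict[int, float],
                             line_rate: float) -> int:
    """Return the first sub-port in `order` that is still below line rate."""
    for port in order:
        if load.get(port, 0.0) < line_rate:
            return port
    # Assumption: if every sub-port is at line rate, keep using the last one.
    return order[-1]
```

With the sub-ports of the example (8, 9, 10 and 12), new flows stay on sub-port 8 until it reaches line rate, then move to sub-port 9, and so on.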

According to yet another specific embodiment, the use of virtual path
interfaces (VPIs) may be combined with multiport interfaces to provide extremely high 
bandwidth tunneled across a wide area network (WAN). More specifically, this may be 
accomplished by configuring a virtual path interface that refers to a multiport interface as 
the carrier of the virtual path provided by the WAN. It is noted that both the upstream and
downstream basic switching units are configured for the same virtual path interface
referring to the same multiport interface. In accordance with the specific embodiment of
the present invention, a multiport interface is first created and the basic switching unit
configured, in a similar manner as described for Fig. 7a. As an example, a multiport
interface is created by the command (e.g., define mpif 8 9 10 11) which
designates, in this example, ports 8, 9, 10 and 11 to be sub-ports of multiport interface
numbered as port 8. After receiving the command to define the multiport interface, the 
switching engine may return an acknowledgment message indicating the successful 
definition of the multiport interface (e.g., multiport interface 8 
successfully defined). Then, a virtual path interface is configured in this specific 
embodiment by a command to define the virtual path interface. Specifically, the switch 
controller of the upstream basic switching unit (Fig. 1) defines a virtual path interface 17 
by a command (e.g., define vpif 8 5 17) which designates, in this example, a 
virtual path interface 5 on the ports 8, 9, 10 and 11 (sub-ports of the multiport interface 
numbered as port 8). The above commands combine the virtual path interface 5 on ports 
8-11 into a virtual path interface numbered 17. In the present specific embodiment, the 
same VPI number should be used on all sub-ports of the created multiport interface to 
carry traffic across the WAN. Managing the virtual path interfaces may be achieved with
other commands to show the virtual path interface, and to delete the virtual path interface.
After receiving the command to define the virtual path interface combined with the 
multiport interface, the switching engine may return an acknowledgment message 
indicating the successful definition of the virtual path interface (e.g., virtual path 
interface 17 on port 8 vpi 5 successfully defined). Configuration 
of a virtual path interface combined with a multiport interface requires that the switching 
engine and the switch controller re-initialize (e.g., by rebooting) their communication in 
order to exchange the new list of available interfaces. 
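The two configuration commands in the example can be modelled to show how a virtual path interface expands over the sub-ports of a multiport interface. The sketch below is a toy interpreter, not the actual command-line software: `InterfaceConfig`, `execute`, and `carriers` are hypothetical names, and numbering the multiport interface after its first listed sub-port is an assumption drawn from the example (interface 8 for ports 8 through 11).

```python
from typing import Dict, List, Tuple


class InterfaceConfig:
    """Hypothetical model of the multiport and virtual path interface tables."""

    def __init__(self) -> None:
        self.multiport: Dict[int, List[int]] = {}            # interface number -> sub-ports
        self.virtual_path: Dict[int, Tuple[int, int]] = {}   # vpif number -> (port, VPI)

    def execute(self, command: str) -> str:
        words = command.split()
        if words[:2] == ["define", "mpif"] and len(words) > 2:
            ports = [int(w) for w in words[2:]]
            number = ports[0]   # assumption: named after the first listed sub-port
            self.multiport[number] = ports
            return f"multiport interface {number} successfully defined"
        if words[:2] == ["define", "vpif"] and len(words) == 5:
            port, vpi, number = (int(w) for w in words[2:])
            self.virtual_path[number] = (port, vpi)
            return (f"virtual path interface {number} on port {port} "
                    f"vpi {vpi} successfully defined")
        raise ValueError(f"unrecognized command: {command!r}")

    def carriers(self, vpif: int) -> List[Tuple[int, int]]:
        """Expand a virtual path interface into (sub-port, VPI) pairs."""
        port, vpi = self.virtual_path[vpif]
        return [(p, vpi) for p in self.multiport.get(port, [port])]
```

After `define mpif 8 9 10 11` and `define vpif 8 5 17`, `carriers(17)` lists VPI 5 on each of ports 8 through 11, reflecting the requirement that the same VPI number be used on all sub-ports.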

In accordance with a specific embodiment of the present invention, the 
source code of the system software (© Copyright, Unpublished Work, Ipsilon Networks,
Inc., All Rights Reserved) for use on the switch controller of a basic switching unit is
included as Appendix I. In particular, Appendix I includes the system software for 
configuration and operation of multiport interfaces, flow characterization and direction on 
sub-ports, interfacing with IFMP and GSMP protocols, routing and forwarding, device 
drivers, operating system interfaces, as well as drivers and modules. 



IV. Conclusion 

The inventions claimed herein provide an improved method and apparatus 
for transmitting packets over a network by multiplexing IP switched flows over a multiport 
interface between basic switching units. It is to be understood that the above description is 
intended to be illustrative and not restrictive. Many embodiments will be apparent to those 
of skill in the art upon reviewing the above description. By way of example, the inventions
herein have been illustrated primarily with regard to transmission of IP packets capable of 
carrying voice, video, image, facsimile, and data signals, but they are not so limited. By 
way of further example, the invention has been illustrated in conjunction with specific 
components and operating speeds, but the invention is not so limited. The scope of the 
inventions should, therefore, be determined not with reference to the above description, 
but should instead be determined with reference to the appended claims, along with the full 
scope of equivalents to which such claims are entitled, by one of ordinary skill in the art.

WHAT IS CLAIMED IS:

1. A method for transmitting packets over a multiport interface between an
upstream node and a downstream node in a network, said downstream node being
downstream from said upstream node, said method comprising the steps of:

establishing a multiport interface comprising a plurality of sub-ports
between said upstream node and said downstream node;

receiving a packet at said downstream node;

performing a flow classification at said downstream node on said packet to
determine whether said packet belongs to a specified flow that should be redirected in the
upstream node to said multiport interface;

selecting a free label for one of said plurality of sub-ports at said
downstream node;

informing said upstream node that future packets belonging to said specified
flow should be sent with said selected free label attached.

2. The method of claim 1 wherein said upstream and downstream nodes use
ATM.

3. The method of claim 2 wherein said free label comprises a VPI/VCI.

4. The method of claim 1 wherein said network comprises a local area
computer network.

5. The method of claim 1 wherein said network comprises a wide area network
(WAN).

6. The method of claim 1 further comprising the step of:

determining said one of said plurality of sub-ports for which said free label
is selected.






7. The method of claim 6 wherein a lowest numbered sub-port of said plurality
of sub-ports is selected as said one of said plurality of sub-ports when all of said plurality
of sub-ports are running at equal outgoing cell rates.



8. The method of claim 6 wherein said one of said plurality of sub-ports has 
the fewest outgoing cell rate of all of said plurality of sub-ports. 

9. The method of claim 6 wherein said one of said plurality of sub-ports has 
the shortest queue of waiting traffic at a given priority level among all of said plurality of 
sub-ports. 

10. The method of claim 6 wherein said determining step includes ensuring that 
said one of said plurality of sub-ports is not a sub-port experiencing failure. 

11. The method of claim 6 wherein said informing step is performed by IFMP 
software that enables communication between said upstream and downstream node, and 
said determining step uses GSMP software. 

12. The method of claim 6 wherein said one of said plurality of sub-ports is not 
yet fully loaded with flows and each of the remaining of said plurality of sub-ports either is 
fully loaded with flows or is loaded with no flows. 

13. The method of claim 6 further comprising the step of: 

configuring a virtual path interface that refers to said multiport interface,
and wherein each of said plurality of sub-ports uses said virtual path interface.

14. A computer program product that enables dynamic shifting between routing 
and switching in a network having an upstream node and a downstream node downstream 
from said upstream node, said computer program product comprising: 

computer-readable code that establishes a multiport interface comprising a 
plurality of sub-ports between said upstream node and said downstream node; 

computer-readable code that performs a flow classification on a packet at 
said downstream node to determine whether said packet belongs to a specified flow that 
should be redirected in said upstream node to said multiport interface; 

computer-readable code that selects a free label for one of said plurality of
sub-ports at said downstream node;

computer-readable code that informs said upstream node that future packets
belonging to said specified flow should be sent with said selected first free label attached;
and

a tangible medium that stores the computer-readable codes.

15. The computer program product of claim 14, wherein said tangible media
comprises a hard disk on a computer.

16. The computer program product of claim 14, wherein said tangible media is
selected from a group consisting of CD-ROM, tape, floppy disk, and the like.

17. The computer program product of claim 14 wherein said computer-readable
codes are installed on a computer attached to a switching hardware engine.

18. The computer program product of claim 17 wherein said computer attached
to a switching hardware engine is an IP switched router.

19. The computer program product of claim 18 wherein said switching
hardware engine utilizes asynchronous transfer mode (ATM) switching technology.

20. The computer program product of claim 19 wherein said flow classification
uses VPI/VCI as labels.

21. The computer program product of claim 19 wherein said switching hardware
engine utilizes a switching technology selected from a group consisting of FDDI, Ethernet,
Fast Ethernet, Gigabit Ethernet, frame relay, and fast packet switching.